"VIDEO ENCODING METHOD AND CORRESPONDING ENCODING AND DECODING DEVICES"

FIELD OF THE INVENTION 

The present invention relates to the field of video compression and, for instance,
to the video coding standards of the MPEG family (MPEG-1, MPEG-2, MPEG-4) and the ITU-T
H.26X family (H.261, H.263 and extensions, H.26L). More specifically, this invention concerns an
encoding method applied to a video sequence corresponding to successive scenes subdivided
into successive video object planes (VOPs) and generating, for coding all the video objects of
said scenes, a coded bitstream constituted of encoded video data in which each data item is
described by means of a bitstream syntax allowing to recognize and decode all the elements of
the content of said bitstream, said content being described in terms of separate channels.

The invention also relates to a corresponding encoding device, to a transmittable
video signal consisting of a coded bitstream generated by such an encoding device, and to a
device for receiving and decoding a video signal consisting of such a coded bitstream.

BACKGROUND OF THE INVENTION 

In the first video coding standards (up to MPEG-2 and H.263), the video is
assumed to be rectangular and to be described in terms of a luminance channel and two
chrominance channels. With MPEG-4, other channels have been introduced, the spatial
resolution of which is described at the sequence level (Video Object Layer, or VOL, in MPEG-4
terminology), as defined in the MPEG-4 document w3056, "Information Technology
- Coding of audio-visual objects - Part 2: Visual", ISO/IEC JTC1/SC29/WG11, Maui, USA,
December 1999. Only one description is given for all channels. The standard defines the
"video_object_layer_width" and "video_object_layer_height" syntax elements (w3056, p.36 and
p.113), which are 13-bit unsigned integers representing the width and height of the displayable
part of the luminance component in pixel units. From these values, the actual spatial resolution of
the different channels is inferred as follows:

- the luminance channel spatial resolution is width x height;

- the shape channel spatial resolution is also width x height;

- the chrominance channels spatial resolution is (width/2) x (height/2).
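
As a non-limiting illustration, the inference rule listed above can be sketched as follows. The function name and the dictionary keys are hypothetical, and the use of integer division for odd sizes is an assumption that is not taken from the cited document.

```python
def infer_channel_resolutions(width: int, height: int) -> dict:
    """Infer per-channel resolutions from the single VOL-level description.

    width and height stand for video_object_layer_width and
    video_object_layer_height, which are 13-bit unsigned integers.
    """
    if not (0 <= width < 2**13 and 0 <= height < 2**13):
        raise ValueError("width and height must fit in 13 bits")
    return {
        "luminance": (width, height),
        "shape": (width, height),
        # Assumption: odd sizes are rounded down by integer division.
        "chrominance": (width // 2, height // 2),
    }


resolutions = infer_channel_resolutions(352, 288)
print(resolutions["chrominance"])  # (176, 144)
```

The sketch makes the limitation visible: every channel resolution is derived from one pair of values, so no channel can be given an independent resolution.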

MPEG-4 also defines the so-called reduced resolution VOP tool. When this tool is used, the size
of the macroblock used for motion compensation decoding is 32 x 32 pixels and the size of
the block is 16 x 16 pixels. It corresponds to the encoding of quarter resolution pictures (decimated
by a factor of 2 vertically and horizontally) at the encoding side. The decoded pictures are then
upsampled to the normal resolution (width x height) at the decoding side. The standard also has
additional syntax elements. A one-bit flag "reduced_resolution_vop_enable", found at the
VOL level (w3056, p.38 and p.118), indicates that the "Dynamic Resolution Conversion" (DRC)
tool is enabled when set to '1'. In such a case, the single-bit flag "vop_reduced_resolution" has
to be retrieved from every VOP header (w3056, p.41, p.47 and p.121). It signals whether the
VOP is encoded at spatially reduced resolution or not. When this flag is set to '1', the VOP is
encoded at spatially reduced resolution and is referred to as a Reduced Resolution VOP. When this
flag is set to '0' or is not present, the VOP is encoded at normal spatial resolution and shall
be decoded by the normal decoding process. From these remarks, it can be seen that the
spatial resolution of the picture is described at the VOP level and that, unfortunately, all channels
have to share the same description.
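
The flag logic described above may be summarized by the following minimal sketch. The function names are hypothetical, and the representation of an absent flag by None is an assumption made for illustration; the 16 x 16 macroblock and 8 x 8 block sizes of the normal process are the usual MPEG-4 values.

```python
from typing import Optional, Tuple


def is_reduced_resolution_vop(reduced_resolution_vop_enable: int,
                              vop_reduced_resolution: Optional[int]) -> bool:
    """Return True when the VOP is a Reduced Resolution VOP."""
    if reduced_resolution_vop_enable != 1:
        # Tool disabled at the VOL level: the VOP-level flag is not present.
        return False
    # A flag equal to 0, or absent (None here), selects normal decoding.
    return vop_reduced_resolution == 1


def motion_compensation_sizes(reduced: bool) -> Tuple[int, int]:
    """Return (macroblock size, block size) in pixels."""
    return (32, 16) if reduced else (16, 8)


reduced = is_reduced_resolution_vop(1, 1)
print(reduced, motion_compensation_sizes(reduced))  # True (32, 16)
```

Both functions take a single decision for the whole VOP, which reflects the fact that all channels share the same description.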

SUMMARY OF THE INVENTION 

It is therefore an object of the invention to propose a video coding method
making it possible to describe a video sequence with channels that have different resolutions.

To this end, the invention relates to a method such as defined in the introductory
part of the description and which is moreover characterized in that said syntax comprises
specific syntactic means for separately describing the spatial resolution of each channel.

The proposed solution, which makes it possible to describe a video sequence with separate
channels that have different characteristics, leads to a greater flexibility in digital video coding
systems, such as the future H.26L standard.

In a more flexible solution, said syntactic means may even comprise, for each
channel, specific syntactic elements for separately describing the spatial resolution of each
image of the sequence (this solution may be optional), and this description may be given, for
the current image of the input sequence, with respect to the spatial resolution of the previous
image in the same channel.

For each channel and for each current image, said spatial resolution may moreover
be described with respect to a reference (or nominal) spatial resolution, which is for instance a
predetermined spatial resolution indicated at the beginning of the bitstream, or the spatial
resolution of one of the channels. The spatial resolution will preferably be described by means
of a division or a multiplication of said reference spatial resolution.

The invention also relates to a device for encoding a video sequence corresponding
to successive scenes subdivided into successive video object planes (VOPs), said device
comprising means for structuring each scene of said sequence as a composition of video objects
(VOs), means for coding the shape, the motion and the texture of each of said VOs, and means
for multiplexing the coded elementary streams thus obtained into a single coded bitstream
constituted of encoded video data in which each data item is described by means of a bitstream
syntax allowing to recognize and decode all the elements of the content of said bitstream, said
content being described in terms of separate channels, said device being further characterized
in that said multiplexing means comprise means for introducing into said single bitstream
specific information for separately describing the spatial resolution of each of said separate
channels.

The invention also relates to a transmittable video signal consisting of a coded
bitstream generated by an encoding method applied to a sequence corresponding to successive
scenes subdivided into successive video object planes (VOPs), said coded bitstream, generated
for coding all the video objects of said scenes, being constituted of encoded video data in which
each data item is described by means of a bitstream syntax allowing to recognize and decode
all the elements of the content of said bitstream, said content being described in terms of
separate channels, said signal being further characterized in that it includes specific
information for separately describing the spatial resolution of each of said separate channels.

The invention finally relates to a device for receiving and decoding a video signal
consisting of a coded bitstream generated by an encoding method applied to a video sequence
corresponding to successive scenes subdivided into successive video object planes (VOPs), said
coded bitstream, generated for coding all the video objects of said scenes, being constituted of
encoded video data in which each data item is described by means of a bitstream syntax
allowing to recognize and decode all the elements of the content of said bitstream, said content
being described in terms of separate channels, and moreover comprising specific information
for separately describing the spatial resolution of each of said separate channels, said decoding
device being further characterized in that it includes means for reading in the received coded
bitstream the specific spatial resolution of each of said separate channels.

DETAILED DESCRIPTION OF THE INVENTION 

As said above, it is not possible, at the moment, to describe a video sequence with
channels that have different resolutions. For instance, instead of having the classical quarter
spatial resolution for the chrominance channels (decimated by a factor of 2 in each direction),
one could imagine, due to bitrate constraints, having ninth-resolution chrominance channels
(decimated by a factor of 3 in each direction). The solutions proposed here provide some syntax
elements to remedy the lack of flexibility of current standards (to also offer more flexibility for
future standards, the solution is extended to channels other than the luminance and
chrominance ones, and proposes the reduced resolution channel tool).
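
The saving suggested by the example above can be checked with a short computation. The sketch below counts samples per picture for one luminance channel and two chrominance channels; the 720 x 576 picture size and the function name are assumptions chosen for illustration, and sample counts are only a rough indicator of bitrate.

```python
def samples_per_picture(width: int, height: int, chroma_factor: int) -> int:
    """Count samples of one luminance and two chrominance channels.

    chroma_factor is the decimation factor applied to the chrominance
    channels in each direction (2 for quarter resolution, 3 for ninth).
    """
    luminance = width * height
    chrominance = (width // chroma_factor) * (height // chroma_factor)
    return luminance + 2 * chrominance


quarter = samples_per_picture(720, 576, 2)
ninth = samples_per_picture(720, 576, 3)
print(quarter, ninth)  # 622080 506880
```

With these assumed dimensions, moving the chrominance channels from quarter to ninth resolution removes 115200 samples per picture, which the existing single-description syntax cannot express.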

In the following, it is assumed that the presence of channels is described by
several syntax elements at the sequence level (VOL in MPEG-4 terminology), for instance as:

Channels presence description:

Video_object_layer_lum                        1 bit
Video_object_layer_chrom                      1 bit (0 for black and white)
Video_object_layer_shape                      1 bit (0 for rectangular)
number_of_additional_channels                 4 bits
video_object_layer_additional_channel[0]      1 bit
video_object_layer_additional_channel[1]      1 bit
video_object_layer_additional_channel[...]    1 bit

These syntax elements should be read as follows:

- if "Video_object_layer_lum" is 1, the bitstream contains syntax elements
for a luminance channel;

- if "Video_object_layer_chrom" is 1, the bitstream contains syntax
elements for the chrominance channels, else the sequence is assumed to be black and white;

- if "Video_object_layer_shape" is 1, the bitstream contains syntax
elements to describe a non-rectangular shape for the picture, else it is assumed to
be rectangular;

- if "number_of_additional_channels" is not zero, the bitstream contains syntax
elements describing additional channels, the presence or absence of each of which is
described by the corresponding video_object_layer_additional_channel[i] syntax element.
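
By way of illustration only, a decoder could read the channels presence description as sketched below. The BitReader class, the most-significant-bit-first order, the consecutive placement of the fields and the rule that exactly number_of_additional_channels one-bit flags follow are all assumptions, since the description does not fix these details.

```python
class BitReader:
    """Reads unsigned integers MSB-first from bytes (assumed bit order)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # index of the next bit to read

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            value = (value << 1) | ((byte >> (7 - self.pos % 8)) & 1)
            self.pos += 1
        return value


def parse_channel_presence(reader: BitReader) -> dict:
    """Parse the presence flags in the order in which they are listed."""
    luminance = reader.read(1) == 1
    chrominance = reader.read(1) == 1  # 0 means black and white
    shape = reader.read(1) == 1        # 0 means rectangular
    count = reader.read(4)             # number_of_additional_channels
    # Assumption: exactly `count` one-bit presence flags follow.
    additional = [reader.read(1) == 1 for _ in range(count)]
    return {
        "luminance": luminance,
        "chrominance": chrominance,
        "shape": shape,
        "additional": additional,
    }


# Bits 1 1 0 0010 1 0, padded with zeros to two bytes (0xC5, 0x00).
presence = parse_channel_presence(BitReader(bytes([0xC5, 0x00])))
print(presence)
```

In this example the sequence has luminance and chrominance channels, a rectangular shape, and two declared additional channels of which only the first is present.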

The following flags and syntax elements (in italics) are proposed to describe the
spatial resolution and the availability of the reduced resolution tool of every channel. The basic
idea is to start from a nominal resolution (the maximum resolution of all channels) and to
express the spatial resolution of every channel in terms of ratios of this nominal size.
At the high-level sequence description (equivalent to the MPEG-4 VOL level), the following syntax
elements are proposed:

Table 1

Element                                           Type              Semantic

Typical for Claim 1
Vol_horiz_sampling_elements_lum                   Unsigned integer  Width of luminance channel in pixels
[not legible]                                     Unsigned integer  Height of luminance channel in pixels
Vol_horiz_sampling_elements_channels[i]           Unsigned integer  Width of the i-th additional channel
Vol_vert_sampling_elements_channels[i]            Unsigned integer  Height of the i-th additional channel

Typical for Claim 2
Vop_horiz_reduced_resolution_lum                  1 bit             Use the horizontal reduced resolution tool on the luminance channel
Vop_vert_reduced_resolution_lum                   1 bit             Use the vertical reduced resolution tool on the luminance channel
Vop_horiz_reduced_resolution_channels[i]          1 bit             Use the horizontal reduced resolution tool on the i-th additional channel
Vop_vert_reduced_resolution_channels[i]           1 bit             Use the vertical reduced resolution tool on the i-th additional channel

Typical for Claim 3
Vol_horiz_reduced_resolution_lum_enable           1 bit             Enable the horizontal reduced resolution tool on the luminance channel
[not legible]                                     1 bit             Enable the vertical reduced resolution tool on the luminance channel
Vol_horiz_reduced_resolution_channels_enable[i]   1 bit             Enable the horizontal reduced resolution tool on the i-th additional channel
Vol_vert_reduced_resolution_channels_enable[i]    1 bit             Enable the vertical reduced resolution tool on the i-th additional channel

Typical for Claim 6
Vol_horiz_sampling_elements                       13 bits           Horizontal nominal size (pixels)
Vol_vert_sampling_elements                        13 bits           Vertical nominal size (pixels)

Typical for Claim 8
Vol_horiz_sampling_resolution_lum_ratio           2 bits            Ratio between horizontal nominal size and luminance horizontal size
Vol_vert_sampling_resolution_lum_ratio            2 bits            Ratio between vertical nominal size and luminance vertical size
Vol_horiz_sampling_resolution_channels_ratio[i]   2 bits            Ratio between horizontal nominal size and i-th additional channel horizontal size
Vol_vert_sampling_resolution_channels_ratio[i]    2 bits            Ratio between vertical nominal size and i-th additional channel vertical size
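
As a hedged sketch of how the nominal size and the 2-bit ratio elements of Table 1 could be combined, the following computes the size of a channel by dividing the nominal size. The mapping from the 2-bit code to the ratio (code k meaning a division by k + 1) is purely hypothetical, as Table 1 only states that the elements are ratios coded on 2 bits; the function name and the example sizes are likewise assumptions.

```python
# Hypothetical mapping from the 2-bit code to the division ratio.
RATIO_FROM_CODE = {0: 1, 1: 2, 2: 3, 3: 4}


def channel_size(nominal_width: int, nominal_height: int,
                 horiz_ratio_code: int, vert_ratio_code: int) -> tuple:
    """Derive one channel's size from the nominal size and its ratio codes.

    nominal_width and nominal_height stand for Vol_horiz_sampling_elements
    and Vol_vert_sampling_elements (13 bits each).
    """
    horiz_ratio = RATIO_FROM_CODE[horiz_ratio_code]
    vert_ratio = RATIO_FROM_CODE[vert_ratio_code]
    # Assumption: sizes that are not exact multiples are rounded down.
    return (nominal_width // horiz_ratio, nominal_height // vert_ratio)


luminance = channel_size(720, 576, 0, 0)    # nominal resolution
chrominance = channel_size(720, 576, 2, 2)  # decimated by 3 in each direction
print(luminance, chrominance)  # (720, 576) (240, 192)
```

Under these assumptions each channel carries its own pair of ratio codes, so the ninth-resolution chrominance channels mentioned earlier become expressible alongside a full-resolution luminance channel.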



The invention is obviously not limited to the encoding method thus defined. It also
relates to a device for encoding a video sequence corresponding to successive scenes
subdivided into successive video object planes (VOPs), said device comprising means for
structuring each scene of said sequence as a composition of video objects (VOs), means for
coding the shape, the motion and the texture of each of said VOs, and means for multiplexing
the coded elementary streams thus obtained into a single coded bitstream constituted of
encoded video data in which each data item is described by means of a bitstream syntax
allowing to recognize and decode all the elements of the content of said bitstream, said content
being described in terms of separate channels, said device being further characterized in that
said multiplexing means comprise means for introducing into said single bitstream specific
information for separately describing the spatial resolution of each of said separate channels.

The invention also relates to a transmittable video signal consisting of a coded
bitstream generated by an encoding method applied to a sequence corresponding to successive
scenes subdivided into successive video object planes (VOPs), said coded bitstream, generated
for coding all the video objects of said scenes, being constituted of encoded video data in which
each data item is described by means of a bitstream syntax allowing to recognize and decode
all the elements of the content of said bitstream, said content being described in terms of
separate channels, said signal being further characterized in that it includes specific
information for separately describing the spatial resolution of each of said separate channels.

The invention finally relates to a device for receiving and decoding a video signal
consisting of a coded bitstream generated by an encoding method applied to a video sequence
corresponding to successive scenes subdivided into successive video object planes (VOPs), said
coded bitstream, generated for coding all the video objects of said scenes, being constituted of
encoded video data in which each data item is described by means of a bitstream syntax
allowing to recognize and decode all the elements of the content of said bitstream, said content
being described in terms of separate channels, and moreover comprising specific information
for separately describing the spatial resolution of each of said separate channels, said decoding
device being further characterized in that it includes means for reading in the received coded
bitstream the specific spatial resolution of each of said separate channels.
CLAIMS:

1. An encoding method applied to an input video sequence corresponding to
successive scenes subdivided into successive video object planes (VOPs) and generating, for
coding all the video objects of said scenes, a coded bitstream constituted of encoded video data
in which each data item is described by means of a bitstream syntax allowing to recognize and
decode all the elements of the content of said bitstream, said content being described in terms
of separate channels, said method being further characterized in that said syntax comprises
specific syntactic means for separately describing the spatial resolution of each channel.

2. A method according to claim 1, characterized in that said syntactic means
comprise, for each channel, specific syntactic elements for separately describing the spatial
resolution of each image of the input sequence.

3. A method according to claim 2, characterized in that said separate description of
the spatial resolution of each image of the input sequence is optional.

4. A method according to any one of claims 2 and 3, characterized in that, for each
channel, said syntactic means comprise syntactic elements for describing the spatial resolution
of the current image of the input sequence with respect to the spatial resolution of the previous
image in the same channel.

5. A method according to any one of claims 2 to 4, characterized in that, for
each channel and for each image, the spatial resolution is described with respect to a reference
spatial resolution.

6. A method according to claim 5, characterized in that said reference spatial
resolution is a predetermined spatial resolution indicated at the beginning of the bitstream.

7. A method according to claim 5, characterized in that said reference spatial
resolution is the spatial resolution of one of the channels.

8. A method according to any one of claims 5 to 7, characterized in that the
spatial resolution is described by means of a division of said predetermined reference spatial
resolution.

9. A method according to any one of claims 5 to 7, characterized in that the spatial
resolution is described by means of a multiplication of said predetermined reference spatial
resolution.

10. A device for encoding a video sequence corresponding to successive scenes
subdivided into successive video object planes (VOPs), said device comprising means for
structuring each scene of said sequence as a composition of video objects (VOs), means for
coding the shape, the motion and the texture of each of said VOs, and means for multiplexing
the coded elementary streams thus obtained into a single coded bitstream constituted of
encoded video data in which each data item is described by means of a bitstream syntax
allowing to recognize and decode all the elements of the content of said bitstream, said content
being described in terms of separate channels, said device being further characterized in that
said multiplexing means comprise means for introducing into said single bitstream specific
information for separately describing the spatial resolution of each of said separate channels.

11. A transmittable video signal consisting of a coded bitstream generated by an
encoding method applied to a sequence corresponding to successive scenes subdivided into
successive video object planes (VOPs), said coded bitstream, generated for coding all the video
objects of said scenes, being constituted of encoded video data in which each data item is
described by means of a bitstream syntax allowing to recognize and decode all the elements of
the content of said bitstream, said content being described in terms of separate channels, said
signal being further characterized in that it includes specific information for separately
describing the spatial resolution of each of said separate channels.

12. A device for receiving and decoding a video signal consisting of a coded bitstream
generated by an encoding method applied to a video sequence corresponding to successive
scenes subdivided into successive video object planes (VOPs), said coded bitstream, generated
for coding all the video objects of said scenes, being constituted of encoded video data in which
each data item is described by means of a bitstream syntax allowing to recognize and decode
all the elements of the content of said bitstream, said content being described in terms of
separate channels, and moreover comprising specific information for separately describing the
spatial resolution of each of said separate channels, said decoding device being further
characterized in that it includes means for reading in the received coded bitstream the specific
spatial resolution of each of said separate channels.
Abstract

The invention relates to an encoding method applied to a video sequence
corresponding to successive scenes and generating a coded bitstream in which each data item
is described by means of a bitstream syntax allowing, at the decoding side, to recognize and
decode all the elements of the content of this coded bitstream. According to the invention, said
syntax comprises specific syntactic means for separately describing the spatial resolution of
each channel or, for each channel, the spatial resolution of each image of the input sequence.
Moreover, said description may be done with respect to a reference spatial resolution, which
may be either an absolute nominal spatial resolution or the spatial resolution of one of the
channels. The invention also relates to the corresponding encoding device, transmittable video
signal and decoding device.